Split Utterances in Dialogue: a Corpus Study

Authors

  • Matthew Purver
  • Christine Howes
  • Eleni Gregoromichelaki
  • Patrick G. T. Healey
Abstract

This paper presents a preliminary English corpus study of split utterances (SUs), single utterances split between two or more dialogue turns or speakers. It has been suggested that SUs are a key phenomenon of dialogue, which this study confirms: almost 20% of utterances were found to fit this general definition, with nearly 3% being the between-speaker case most often studied. Other claims and assumptions in the literature about SUs’ form and distribution are investigated, with preliminary results showing: splits can occur within syntactic constituents, apparently at any point in the string; it is unusual for the separate parts to be complete units in their own right; explicit repair of the antecedent does not occur very often. The theoretical consequences of these results for claims in the literature are pointed out, along with their practical implications for dialogue systems.

Similar resources

Construction of Back-Channel Utterance Corpus for Responsive Spoken Dialogue System Development

In spoken dialogues, if a spoken dialogue system does not respond at all during user’s utterances, the user might feel uneasy because the user does not know whether or not the system has recognized the utterances. In particular, back-channel utterances, which the system outputs as voices such as “yeah” and “uh huh” in English have important roles for a driver in in-car speech dialogues because the ...

SAWDUST: a Semi-Automated Wizard Dialogue Utterance Selection Tool for domain-independent large-domain dialogue

We present a tool that allows human wizards to select appropriate response utterances for a given dialogue context from a set of utterances observed in a dialogue corpus. Such a tool can be used in Wizard-of-Oz studies and for collecting data which can be used for training and/or evaluating automatic dialogue models. We also propose to incorporate such automatic dialogue models back into the to...

I've said it before, and I'll say it again: An empirical investigation of the upper bound of the selection approach to dialogue

We perform a study of existing dialogue corpora to establish the theoretical maximum performance of the selection approach to simulating human dialogue behavior in unseen dialogues. This maximum is the proportion of test utterances for which an exact or approximate match exists in the corresponding training corpus. The results indicate that some domains seem quite suitable for a corpus-based sel...

The effect of context on the intelligibility of dialogue

We measured the effects of task context on the intelligibility of utterances drawn from a corpus of air traffic control dialogue. Subjects understood more words when the utterances were presented in the form of a dialogue, with the original utterance order preserved. Subjects presented with the same utterances in randomized order understood significantly fewer words. This effect was seen with b...

Towards Understanding Egyptian Arabic Dialogues

Labelling a user's utterances to understand their intents, known as Dialogue Act (DA) classification, is considered a key component of the dialogue language understanding layer in automatic dialogue systems. In this paper, we propose a novel approach to labelling user utterances in Egyptian spontaneous dialogues and instant messages using a Machine Learning (ML) approach without ...

Publication date: 2009